THE SECRETARY PROBLEM ON AN UNKNOWN POSET

BRYN GARROD AND ROBERT MORRIS 

Abstract. We consider generalizations of the classical secretary problem, also known as the
problem of optimal choice, to posets where the only information we have is the size of the
poset and the number of maximal elements. We show that, given this information, there is
an algorithm that is successful with probability at least $1/e$. We conjecture that if there are $k$
maximal elements and $k \geq 2$ then this can be improved to $\sqrt[k-1]{1/k}$, and prove this conjecture for
posets of width $k$. We also show that no better bound is possible.

1. Introduction

The exact origins of the classical secretary problem are complicated (and the subject of
Ferguson's history of the problem [4]), but the problem was popularized by Martin Gardner in
his Scientific American column in February 1960, as the game googol. The problem itself is
simple to state, and its 'secretary problem' formulation is as follows. There are $n$ candidates to
be interviewed for a position as a secretary. They are interviewed one by one and, after each
interview, the interviewer must decide whether or not to accept that candidate. If the candidate
is accepted then the process stops, and if the candidate is rejected then the interviewer moves on
to the next candidate. The interviewer may only accept the most recently interviewed candidate.
At each stage, the interviewer knows the complete ranking of the candidates interviewed so far,
all of whom are comparable, but has no other measure of their ability. The interviewer is only
interested in finding the very best candidate; selecting any other for the job is considered a
failure. It is well known (see [1], for example) that the interviewer has a simple strategy that is
successful with probability at least $1/e$, and that there is no strategy achieving a better bound.

Since 1960, many generalizations of the problem have been considered. One direction has
been to consider partial orders on the candidates other than a total order. In this case, the
interviewer knows the poset induced by the candidates interviewed so far, and wishes to choose
a candidate who is maximal in the original poset. Morayne [13] considered the case of a full
binary tree of depth $n$, and showed that the optimal strategy is to select the maximum out
of the elements seen so far when the poset induced by these elements is either linear of length
greater than $n/2$ or non-linear with a unique maximum. He showed that as $n$ tends to infinity,
the probability of success tends to 1. Garrod, Kubicki and Morayne [6] considered the case of
$n$ pairs of 'twins', where there are $n$ levels with two incomparable elements on each level. They
showed that the optimal strategy is to wait until a certain threshold number of levels have been
seen and then to select the next element that is maximal and whose twin has already been
seen. They further showed that as $n$ tends to infinity, this threshold tends to approximately $0.4709n$
and the probability of success to approximately $0.7680$. Calculating these asymptotic values for the
natural extension to '$k$-tuplets', for $k > 2$, seems to be a harder problem.

A further interesting generalization was an attempt to find an algorithm that was successful
on all posets of a given size with positive probability. Surprisingly, Preater [14] proved that there
is such a 'universal' algorithm (depending only on the size of the poset), which is successful on
every poset with probability at least $\frac{1}{8}$. In this algorithm, an initial random number of elements
are rejected and a subsequent element is accepted according to randomized criteria. A slightly
modified version of the algorithm, also suggested by Preater, was analysed by Georgiou, Kuchta,
Morayne and Niemiec [7], and gave an improved lower bound of $\frac{1}{4}$ for the probability of success.
More recently, Kozik [10] introduced a 'dynamic threshold strategy' and showed that it was
successful with probability at least $\frac{1}{4} + \varepsilon$, for some $\varepsilon > 0$ and for all sufficiently large posets.
Since the best possible probability of success in the classical secretary problem, on a totally
ordered set, is $\frac{1}{e}$, the best possible lower bound for a universal algorithm must lie between
$\frac{1}{4} + \varepsilon$ and $\frac{1}{e}$.

In this paper, we show that given any poset there is an algorithm that is successful with
probability at least $1/e$, so, in this sense, the total order is the hardest possible partial order. In
fact, this algorithm depends only on the size of the poset and its number of maximal elements, 
so it is universal for any family where these are given. It is therefore natural to ask which is the 
hardest partial order with a given number of maximal elements. The most obvious choice is the 
poset consisting of k disjoint chains. We shall give an asymptotically sharp lower bound on the 
probability of success in the problem of optimal choice on k disjoint chains, and show that it is 
at least as hard as on any poset with k maximal elements and of width k, that is, whose largest 
antichain has size k. 

More precisely, our main aim is to prove the following two theorems. 

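Theorem 1.1. Let $(P, \prec)$ be a poset with $k$ maximal elements and of width $k$. Then there is
an algorithm for the secretary problem on $(P, \prec)$ that is successful with probability at least $p_k$,
where

\[
p_k = \begin{cases} \dfrac{1}{e} & \text{if } k = 1, \\[2mm] \sqrt[k-1]{\dfrac{1}{k}} & \text{if } k > 1, \end{cases} \tag{1.1}
\]

and these are the best possible such bounds.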

We emphasize that in both the theorem above and that below, the claimed algorithm is not
universal, but depends on both $|P|$ and the number of maximal elements of $(P, \prec)$.

Theorem 1.2. Let $(P, \prec)$ be a poset. Then there is an algorithm for the secretary problem on
$(P, \prec)$ that is successful with probability at least $1/e$, and this is the best possible such bound.

We conjecture that Theorem 1.1 can be extended to all posets with $k$ maximal elements. It
is not inconceivable that the same algorithm works; if not, it would be good to find some other
algorithm dependent only on $k$ that does so.

Conjecture 1.3. Let $(P, \prec)$ be a poset with $k$ maximal elements. Then there is an algorithm
for the secretary problem on $(P, \prec)$ that is successful with probability at least $p_k$, where $p_k$ is as
defined in (1.1).

Our algorithm, which gives the bound in Theorem 1.2, depends only on the size of the poset and
the number of maximal elements. In the original version of this paper we conjectured that the
latter piece of information is not needed; a beautiful proof of this result was given around the
same time by Freij and Wästlund [5].

Theorem 1.4 (Freij and Wästlund [5]). There is a universal algorithm for the secretary problem
which is successful on every poset $(P, \prec)$ with probability at least $1/e$.

We remark that the secretary problem on a poset with $k$ maximal elements was also considered
recently (and independently of this work) by Kumar, Lattanzi, Vassilvitskii and Vattani [12],
who obtained similar results via a different method. The poset consisting of $k$ disjoint chains
was also studied by Kuchta and Morayne [11], but with a restriction on the order in which the
elements are observed: those from the first chain all appear in a random order, then those from
the second chain, and so on. This poset is also related to multicriteria extensions of the secretary
problem. In the original multicriteria version, each element is ranked independently in $k > 1$
different criteria, and the selector wishes to select an element that is maximal in at least one
of them. This is equivalent to the problem on $k$ equally-sized disjoint chains with the elements
appearing one at a time from each chain in the same cyclic order. This version was solved
by Gnedin [8]. Gnedin has also produced a more general survey of multicriteria problems [9].

Interestingly, the asymptotic value of the probability of success in Theorem 1.1, $\sqrt[k-1]{1/k}$, is the
same as in the multicriteria version.

This paper is organized as follows. In Section 2 we shall introduce the formal model and
some notation. In Section 3 we shall describe a (randomized) algorithm for choosing an element
of our poset, and prove lower bounds for its probability of success for various families of posets.
In Section 4 we shall show that our bounds are best possible, by proving that, for the poset
that consists of $k$ disjoint chains of length $n$ (which lies in each of these families), there is no
strategy that wins with probability greater than $p_k + o(1)$ (as $n \to \infty$).

2. Formal model and notation 

We begin by defining formally the probability space in which we shall work throughout the 
paper. The reader who wishes to avoid technicalities on a first reading is encouraged to skip 
this section, since all crucial definitions will be restated when used. 

Our probability space will depend on a poset $(P, \prec)$ with $P = \{x_1, \ldots, x_n\}$. Let $\max_\prec(P)$
denote the set of its maximal elements, that is,

\[
\max\nolimits_\prec(P) = \{x \in P : \nexists\, y \text{ such that } x \prec y\}.
\]

We shall suppress the subscript in $\max_\prec$ when it is clear from the context.

Given $(P, \prec)$, we shall work with a probability space $(\Omega_P, \mathcal{F}_P, \mathbb{P}_P)$, with $\mathbb{E}_P$ defined in the
obvious way. We shall suppress the subscripts when they are clear from the context, as they
will be for the rest of this section. We define the probability space $(\Omega, \mathcal{F}, \mathbb{P})$ as follows. Set
$\Omega = S_n \times [0, 1]$, where $S_n$ is the permutation group on $[n]$, and $\mathcal{F} = \mathcal{P}(S_n) \times \mathcal{B}$, where $\mathcal{B}$ is the
Borel $\sigma$-algebra. Let $\mathbb{P} = \mu \times \lambda$, where $\mu$ is the uniform probability measure, that is,

\[
\mu(\sigma) = \frac{1}{n!}
\]

for all $\sigma \in S_n$, and $\lambda$ is the Lebesgue measure. In other words, $(\sigma, \delta) \in \Omega$ is picked uniformly
at random. Given $(\sigma, \delta) \in \Omega$, the $\sigma$-co-ordinate will determine the order in which elements of $P$
appear and the $\delta$-co-ordinate will allow us to introduce randomness independent of this order
into our algorithms. Specifically, the $\delta$-co-ordinate will determine an initial number of elements
to reject without considering them. The reason why we are using continuous space and Lebesgue
measure, despite the fact that all of our randomized strategies pick one of a finite number of
options, is that this allows them all to lie in the same probability space.

Let $\pi$ be the random variable on $\Omega$, taking values in the set of permutations of $P$, that is
defined by

\[
\pi(\sigma, \delta)(i) = x_{\sigma(i)}.
\]

Let $(P_t)_{t \in [n]}$ be a family of random variables with $P_t$ representing the poset we see at time $t$.
Formally, each $P_t$ maps $\Omega$ to the set of all posets with vertex set $[t] = \{1, \ldots, t\}$, and
$P_t(\sigma, \delta) = ([t], \prec_t)$ is defined by

\[
\forall i, j \in [t], \quad i \prec_t j \iff \pi(i) \prec \pi(j).
\]

The poset $P_t$ is the natural description of what we see at time $t$ as the elements of $P$ appear
one by one.

Let $(\mathcal{F}_t)_{t \in [n]}$ be the sequence of $\sigma$-algebras with each $\mathcal{F}_t$ generated by the random variables
$P_1, \ldots, P_t$, that is,

\[
\mathcal{F}_t = \sigma(P_1, \ldots, P_t) = \sigma(P_t),
\]

the second equality holding since $P_t$ is a labelled poset and so its value determines the values
of $P_1, \ldots, P_{t-1}$. We think of $\mathcal{F}_t$ as the information we know at time $t$ about where we are in
the universe $\Omega$. Since $P_t$ takes only finitely many values, $\mathcal{F}_t$ has a simple structure; it consists of the
pre-images in $\Omega$ of the possible values of $P_t$ and the unions of these pre-images. We call these
pre-images the atoms of $\mathcal{F}_t$.

Let $\mathcal{F}'_t$ be the projection of $\mathcal{F}_t$ onto $\mathcal{P}(S_n)$. Since our definitions have so far depended only
on the $\sigma$-co-ordinate of $(\sigma, \delta) \in \Omega$, we see that, for each $t$,

\[
\mathcal{F}_t = \{A \times [0, 1] : A \in \mathcal{F}'_t\}.
\]

In other words, $(\sigma_1, \delta_1)$ and $(\sigma_2, \delta_2)$ are in the same atom of $\mathcal{F}_t$ if and only if $\sigma_1$ and $\sigma_2$ are in
the same atom of $\mathcal{F}'_t$, which happens if and only if the labelled posets induced by the first $t$
elements $\pi(1), \ldots, \pi(t)$ are identical.

By a stopping time, we mean a random variable $\tau$ taking values in $[n]$ and satisfying the
property

\[
\{\tau = t\} \in \mathcal{F}_t,
\]

that is, our decision to stop at time $t$ is based only on the values of $P_1, \ldots, P_t$.

We shall need to refer to conditional expectation and probability, which in the finite world
are trivial, intuitive concepts. We define a family of random variables $(Z_t)_{t \in [n]}$ by

\[
Z_t = \mathbb{P}[\pi(t) \in \max(P) \mid \mathcal{F}_t],
\]

that is, $Z_t$ is the probability that the $t$-th element observed is maximal given $P_1, \ldots, P_t$. Our
aim will be to choose a stopping time $\tau$ to maximize $\mathbb{P}[\pi(\tau) \in \max(P)]$. The value of
$\mathbb{P}[\pi(\tau) \in \max(P)]$ can be easily shown to be equal to $\mathbb{E}(Z_\tau)$; see page 45 of Chow, Robbins
and Siegmund [2], for example. These equivalent formulations will be useful later.

Recall that $\mathcal{F}'_t$ is the projection of $\mathcal{F}_t$ onto $\mathcal{P}(S_n)$. By a randomized stopping time, we mean
a random variable $\tau$ taking values in $[n]$ and satisfying the property

\[
\{\tau = t\} \in \mathcal{F}'_t \times \mathcal{B},
\]

that is, our decision to stop at time $t$ is based on the values of $P_1, \ldots, P_t$ and on some
$\mathcal{B}$-measurable random variable. The randomized stopping times that we shall consider will be
convex combinations of a finite number of true stopping times, so if such a randomized stopping
time gives a certain probability of success, then there is a true stopping time with at least that
probability of success.



3. Lower bounds 

Throughout this section, $p$ is a real number satisfying $0 < p < 1$. Recall that $\pi(t)$ is the $t$-th
element of the poset $P$ that we see, and that $P_t$ is a poset with vertex set $[t]$ that is isomorphic
to the poset seen at time $t$. We shall prove lower bounds for the probability of success of the
following randomized algorithm on different families of posets.

Algorithm. Given a poset with $n$ elements, of which $k$ are maximal, let $X(p) \sim \mathrm{Bin}(n, p)$.
Reject the first $X(p)$ elements and accept the first subsequent element where the following
condition holds: the poset induced by the elements seen so far (including the currently observed
element) has at most $k$ maximal elements and the currently observed element is one of them.
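
As an illustration only, the algorithm can be simulated directly. The sketch below is not part of the paper: the names `run_algorithm` and `success_rate`, the encoding of a poset by a comparison function `less`, and the choice of $k$ disjoint chains of equal length as the test poset are all assumptions made for the example. If no element qualifies, the sketch stops at the last element, matching the convention used for the stopping time.

```python
import random

def run_algorithm(elements, less, k, p, rng):
    """Run the rule once and return the accepted element (hypothetical helper)."""
    order = list(elements)
    rng.shuffle(order)                           # uniformly random arrival order
    n = len(order)
    x = sum(rng.random() < p for _ in range(n))  # X(p) ~ Bin(n, p)
    seen = []
    for t, current in enumerate(order, start=1):
        seen.append(current)
        if t <= x:
            continue                             # rejected without being considered
        # Maximal elements of the poset induced by everything seen so far.
        maximal = [a for a in seen if not any(less(a, b) for b in seen)]
        if len(maximal) <= k and current in maximal:
            return current
    return order[-1]                             # nothing qualified: stop at time n

def success_rate(k, m, p, trials, seed=0):
    """Monte Carlo success frequency on k disjoint chains, each of length m."""
    rng = random.Random(seed)
    elements = [(c, h) for c in range(k) for h in range(m)]  # (chain, height)
    less = lambda a, b: a[0] == b[0] and a[1] < b[1]
    wins = sum(run_algorithm(elements, less, k, p, rng)[1] == m - 1
               for _ in range(trials))
    return wins / trials
```

For example, `success_rate(2, 6, 0.5, 2000)` estimates the success probability on two chains of length six with $p = 1/2$; the estimate is subject to sampling error.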

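This algorithm gives rise to the following stopping time, $\tau_k(p)$.
Let $X(p) : \Omega \to \{0, \ldots, n\}$ be the random variable defined by

\[
X(p)(\sigma, \delta) = \min\Big\{ x \geq 0 : \sum_{i=0}^{x} \binom{n}{i} p^i (1-p)^{n-i} \geq \delta \Big\},
\]

so that

\[
\mathbb{P}(X(p) = x) = \binom{n}{x} p^x (1-p)^{n-x}
\]

and $X(p) = X(p)(\sigma, \delta)$ is independent of $\sigma$. Then $\tau_k(p)$ is defined by

\[
\tau_k(p) = \begin{cases} \min\{t > X(p) : |\max(P_t)| \leq k \text{ and } t \in \max(P_t)\} & \text{if this exists,} \\ n & \text{otherwise.} \end{cases}
\]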
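
The role of the $\delta$-co-ordinate can be made concrete with a short sketch. It assumes that $X(p)$ is obtained from $\delta$ by the usual inverse-distribution-function construction, that is, as the smallest $x$ at which the $\mathrm{Bin}(n, p)$ distribution function reaches $\delta$; the function name `x_of_p` is hypothetical.

```python
from math import comb

def x_of_p(n, p, delta):
    """Smallest x with P[Bin(n, p) <= x] >= delta (inverse-CDF transform)."""
    cdf = 0.0
    for x in range(n + 1):
        cdf += comb(n, x) * p**x * (1 - p)**(n - x)
        if cdf >= delta:
            return x
    return n  # guards against floating-point round-off when delta is close to 1
```

Feeding a uniform $\delta \in [0, 1]$ into this map yields a binomial random variable that does not depend on the order in which the elements arrive.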

Given the definition of $\tau_k(p)$, it makes sense to consider another random variable, the set of
$X(p)$ elements that we reject without considering. We denote this random variable by $S(p)$,
where

\[
S(p) = \{\pi(t) : t \leq X(p)\}.
\]

We shall make use of the following simple property of $S(p)$, which is easily verified.

Lemma 3.1. The events $\{x \in S(p)\}_{x \in P}$ are independent and $\mathbb{P}(x \in S(p)) = p$ for all $x \in P$.

Proof. We can generate $\pi$ and $X(p)$ with the required distributions in the following way. Put
each element of $P$ in $S(p)$ with probability $p$ independently of all other elements. Let $\pi$ consist of
a uniformly random ordering of the elements of $S(p)$ followed by a uniformly random ordering of
$P \setminus S(p)$. By symmetry, $\pi$ is a uniformly random ordering of $P$, and $X(p) = |S(p)|$ is a binomial
random variable independent of $\pi$. The events $\{x \in S(p)\}_{x \in P}$ depend only on $\pi$ and $X(p)$, and
by construction the properties in the statement of the lemma hold. □
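
The symmetry claim in this proof can be checked exactly for small $n$. The following sketch (with the hypothetical name `joint_distribution`, and with the elements of $P$ labelled $0, \ldots, n-1$ for convenience) enumerates the two-stage construction and returns the exact joint law of the ordering and of $|S(p)|$, using rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import comb, factorial

def joint_distribution(n, p):
    """Exact law of (ordering, |S|) under the two-stage construction."""
    dist = {}
    items = range(n)
    for size in range(n + 1):
        for s in combinations(items, size):          # candidate set S(p)
            rest = [v for v in items if v not in s]
            # P[S(p) = s] times the chance of one particular ordering of s and rest.
            weight = (p**size * (1 - p)**(n - size)
                      / (factorial(size) * factorial(n - size)))
            for head in permutations(s):
                for tail in permutations(rest):
                    key = (head + tail, size)
                    dist[key] = dist.get(key, 0) + weight
    return dist
```

Each pair (ordering, $x$) should receive mass $\binom{n}{x} p^x (1-p)^{n-x} / n!$, which is the product of the uniform law on orderings and the binomial law, as the proof asserts.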

We shall also use the following standard identity; for completeness we include a proof.

Lemma 3.2. For all integers $k \geq 1$, the following holds:

\[
\sum_{s=0}^{\infty} \binom{k+s-1}{k-1} p^k (1-p)^s = 1.
\]

Proof. Suppose that we have a coin that comes up heads with probability $p$ and tails with
probability $1 - p$, and that we toss it infinitely many times. Then, with probability 1, we shall
see at least $k$ heads, and the $k$-th head comes up in position $k + s$ for some $s \geq 0$. In this case,
we know that $k - 1$ of the first $k + s - 1$ tosses are heads and the remaining $s$ are tails, and so,
summing over the probabilities that the $k$-th head comes in each position, we have

\[
\sum_{s=0}^{\infty} \binom{k+s-1}{k-1} p^k (1-p)^s = 1,
\]

as required. □
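
A quick numerical sanity check of this identity is sketched below; the function name `neg_binomial_total` and the truncation point are choices made for the example, and the truncated tail is negligible for the values of $p$ used.

```python
from math import comb

def neg_binomial_total(k, p, terms=2000):
    """Partial sum of C(k+s-1, k-1) * p^k * (1-p)^s over s = 0, ..., terms-1."""
    return sum(comb(k + s - 1, k - 1) * p**k * (1 - p)**s for s in range(terms))
```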



In order to prove Theorem 1.1, we first calculate a lower bound for the probability that $\tau_k(p)$
is successful on the poset consisting of $k$ disjoint chains. Recall that $p$ is a real number satisfying
$0 < p < 1$ and that $\pi(\tau_k(p))$ is the element that the algorithm $\tau_k(p)$ selects.



Figure 1. An example of $k$ disjoint chains with the elements of $S(p)$ circled.
This illustrates an instance of the event $A_{0,3,\ldots,1}$. The region enclosed by the
solid curve marks the $j_1 + \ldots + j_k$ elements that might be selected.



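Theorem 3.3. Let $(P, \prec)$ be a poset consisting of $k$ disjoint chains. Then

\[
\mathbb{P}[\pi(\tau_k(p)) \in \max(P)] \geq \begin{cases} p \log \dfrac{1}{p} & \text{if } k = 1, \\[2mm] \dfrac{k}{k-1}\, p\, (1 - p^{k-1}) & \text{if } k > 1. \end{cases}
\]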

Proof. We first note that $\pi(\tau_k(p)) \in \max(P)$ in the exceptional case where $S(p) = P$ and
$\pi(\tau_k(p)) = \pi(n) \in \max(P)$, an event with probability $\frac{k}{n} p^n$. This tends to $0$ as $n \to \infty$, and
we obtain the bounds in the theorem by considering only the cases where $X(p) < n$ and hence
$\pi(\tau_k(p)) \notin S(p)$. However, when we come to the proof of Lemma 3.8, the fact that these bounds
are for a slightly smaller event will be important.

Let the $k$ chains be denoted by $C_1, \ldots, C_k$ and have lengths $m_1, \ldots, m_k$. Let $A_{j_1,\ldots,j_k}$ be the
event that for each $i$ there are $j_i$ elements from $C_i$ not in $S(p)$ above the highest element from
$C_i$ in $S(p)$ (see Figure 1), that is,

\[
A_{j_1,\ldots,j_k} = \bigcap_{i=1}^{k} \Big\{ \big|\{x \in C_i \setminus S(p) : \nexists\, y \in C_i \cap S(p) \text{ such that } x \prec y\}\big| = j_i \Big\}.
\]

For $j_i < m_i$, this means that the top $j_i$ elements are not in $S(p)$ but the $(j_i + 1)$-th is. For $j_i = m_i$,
this means that there are no elements from the $i$-th chain in $S(p)$. Note that if $A_{j_1,\ldots,j_k}$ occurs
then $\pi(\tau_k(p))$ will be the first element observed from the $j_1 + \ldots + j_k$ elements not in $S(p)$ that
are at the tops of their chains.

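The events $\{A_{j_1,\ldots,j_k} : 0 \leq j_1 \leq m_1, \ldots, 0 \leq j_k \leq m_k\}$ partition the whole space. Thus,
writing $Q_k(p)$ for $\mathbb{P}[\pi(\tau_k(p)) \in \max(P)]$,

\begin{align*}
Q_k(p) &= \sum_{0 \leq j_1 \leq m_1, \ldots, 0 \leq j_k \leq m_k} \mathbb{P}[\pi(\tau_k(p)) \in \max(P) \mid A_{j_1,\ldots,j_k}] \cdot \mathbb{P}[A_{j_1,\ldots,j_k}] \\
&\geq \sum_{(j_1,\ldots,j_k) \neq (0,\ldots,0)} \frac{|\{i : j_i > 0\}|}{j_1 + \ldots + j_k}\, (1-p)^{j_1 + \ldots + j_k}\, p^{|\{i : j_i < m_i\}|}. \tag{3.1}
\end{align*}

Since $1 + (1-p) + (1-p)^2 + \ldots = \frac{1}{p}$, this can be written as

\begin{align*}
Q_k(p) &\geq \sum_{(j_1,\ldots,j_k) \neq (0,\ldots,0)} \frac{|\{i : j_i > 0\}|}{j_1 + \ldots + j_k}\, (1-p)^{j_1 + \ldots + j_k}\, p^k \big(1 + (1-p) + (1-p)^2 + \ldots\big)^{|\{i : j_i = m_i\}|} \\
&= \sum_{(j_1,\ldots,j_k) \neq (0,\ldots,0)} \frac{|\{i : j_i > 0\}|}{\min\{j_1, m_1\} + \ldots + \min\{j_k, m_k\}}\, (1-p)^{j_1 + \ldots + j_k}\, p^k \\
&\geq \sum_{(j_1,\ldots,j_k) \neq (0,\ldots,0)} \frac{|\{i : j_i > 0\}|}{j_1 + \ldots + j_k}\, (1-p)^{j_1 + \ldots + j_k}\, p^k, \tag{3.2}
\end{align*}

where the first sum is over $0 \leq j_i \leq m_i$ and the last two are over all $j_i \geq 0$.
To see the equality, simply note that the term corresponding to $(j_1, \ldots, j_k)$ on the right-hand
side appears on the left-hand side by choosing the term $(1-p)^{j_i - m_i}$ in the sum whenever $j_i \geq m_i$.
We now rewrite (3.2) as a sum over $r = |\{i : j_i > 0\}|$ and $s = j_1 + \ldots + j_k$, and obtain

\[
Q_k(p) \geq \sum_{r=1}^{k} \sum_{s=r}^{\infty} \big|\{(j_1, \ldots, j_k) : |\{i : j_i > 0\}| = r \text{ and } j_1 + \ldots + j_k = s\}\big| \cdot \frac{r}{s} (1-p)^s p^k.
\]

The rest of the proof is a straightforward calculation. To calculate $|\{(j_1, \ldots, j_k) : |\{i : j_i >
0\}| = r \text{ and } j_1 + \ldots + j_k = s\}|$, we note that there are $\binom{k}{r}$ ways of choosing the indices $i$ with
$j_i > 0$ and there are then $\binom{s-1}{r-1}$ ways for $r$ non-zero numbers to add up to $s$. Thus

\[
Q_k(p) \geq \sum_{r=1}^{k} \sum_{s=r}^{\infty} \binom{k}{r} \binom{s-1}{r-1} \frac{r}{s} (1-p)^s p^k.
\]

Reversing the order of summation, and using $r \binom{k}{r} = k \binom{k-1}{r-1}$,

\[
Q_k(p) \geq k p^k \sum_{s=1}^{\infty} \frac{1}{s} (1-p)^s \sum_{r=1}^{\min\{k,s\}} \binom{k-1}{r-1} \binom{s-1}{r-1}.
\]

The second sum is easily evaluated (a result known as Vandermonde's identity) to give

\[
Q_k(p) \geq k p^k \sum_{s=1}^{\infty} \frac{1}{s} (1-p)^s \binom{k+s-2}{k-1}. \tag{3.3}
\]

Finally, let us evaluate the sum in the above equation. Write

\[
V_k(p) = \sum_{s=1}^{\infty} \frac{1}{s} (1-p)^s \binom{k+s-2}{k-1}.
\]

Differentiating, and then applying Lemma 3.2, we find that

\[
\frac{dV_k}{dp} = -\sum_{s=1}^{\infty} (1-p)^{s-1} \binom{k+s-2}{k-1} = -\frac{1}{p^k}.
\]

We now integrate to obtain

\[
V_k(p) = \begin{cases} -\log p + c_1 & \text{if } k = 1, \\[1mm] \dfrac{1}{(k-1)p^{k-1}} + c_k & \text{if } k > 1, \end{cases}
\]

where the $c_k$ are constants. Since the expressions above are continuous in $p$ in the interval $(0, 1]$,
we may consider limits as $p \to 1$ to find $c_k$ and deduce that

\[
\sum_{s=1}^{\infty} \frac{1}{s} (1-p)^s \binom{k+s-2}{k-1} = \begin{cases} \log \dfrac{1}{p} & \text{if } k = 1, \\[2mm] \dfrac{1}{k-1} \Big( \dfrac{1}{p^{k-1}} - 1 \Big) & \text{if } k > 1. \end{cases}
\]

Substituting the value of $V_k(p)$ into (3.3) gives the result. □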
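
The closed form obtained at the end of this calculation can be compared numerically with the series it evaluates. The sketch below assumes the series is $\sum_{s \geq 1} \frac{1}{s}(1-p)^s \binom{k+s-2}{k-1}$, as in (3.3); the names `v_series`, `v_closed` and `bound` are introduced only for this example, and the series is truncated at a point where the tail is negligible for the values of $p$ used.

```python
from math import comb, log

def v_series(k, p, terms=5000):
    """Truncated series sum_{s>=1} (1/s) (1-p)^s C(k+s-2, k-1)."""
    return sum(comb(k + s - 2, k - 1) * (1 - p)**s / s for s in range(1, terms))

def v_closed(k, p):
    """Closed form: log(1/p) for k = 1, (p^(1-k) - 1)/(k-1) for k > 1."""
    return log(1 / p) if k == 1 else (p**(1 - k) - 1) / (k - 1)

def bound(k, p):
    """Resulting lower bound k * p^k * V_k(p) on the success probability."""
    return k * p**k * v_closed(k, p)
```

For $k > 1$ the last function simplifies to $\frac{k}{k-1}\, p\, (1 - p^{k-1})$, the expression in the statement of the theorem.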



In order to extend the result above to posets whose width is the same as their number of
maximal elements, we shall use Dilworth's theorem [3] (see also page 81 of [1]):

Dilworth's theorem. A poset with largest antichain of size k can be covered by k chains. 

In the next theorem, we shall show that the secretary problem is no harder on a poset with 
k maximal elements and width k than on a poset consisting of k disjoint chains. 



Figure 2. An example of $k$ disjoint chains with one extra comparison, and
with elements of $S(p)$ circled. The region enclosed by the solid curve marks the
elements that might be selected. The element in the dotted region could have
been selected if the extra comparison were not there (cf. Figure 1).

Theorem 3.4. Let $(P, \prec)$ be a poset with $n$ elements. Suppose that $(P, \prec)$ has $k$ maximal
elements and that none of its antichains has size greater than $k$. Then

\[
\mathbb{P}[\pi(\tau_k(p)) \in \max(P)] \geq \begin{cases} p \log \dfrac{1}{p} & \text{if } k = 1, \\[2mm] \dfrac{k}{k-1}\, p\, (1 - p^{k-1}) & \text{if } k > 1. \end{cases}
\]

Proof. By Dilworth's theorem, we see that $P$ takes the form of $k$ chains with some comparisons
in between them. Clearly, the $k$ elements of $\max(P)$ lie at the top of the $k$ chains. The proof
therefore proceeds in an almost identical manner to that of Theorem 3.3. The only difference is
that the denominator in each term of (3.1) is now at most, rather than equal to, $j_1 + \ldots + j_k$
(see Figure 2), so the expression in this line is still a lower bound. The calculations that make
up the remainder of the proof of Theorem 3.3 therefore follow in the same way. □



The values that maximize the function in Theorem 3.4 are

\[
p_k = \begin{cases} \dfrac{1}{e} & \text{if } k = 1, \\[2mm] \sqrt[k-1]{\dfrac{1}{k}} & \text{if } k > 1. \end{cases} \tag{3.4}
\]

This gives us the following corollary and the lower bounds in Theorem 1.1.

Corollary 3.5. Let $(P, \prec)$ be a poset with $n$ elements. Suppose that $(P, \prec)$ has $k$ maximal
elements and that none of its antichains has size greater than $k$. Then

\[
\mathbb{P}[\pi(\tau_k(p_k)) \in \max(P)] \geq p_k.
\]

It is interesting to note that the expected proportion of elements that we reject without 
considering is the same as the probability of success. 
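
The maximizing values, and the coincidence just noted, can be verified by a short calculation. The following is a sketch in which $f_k$ is a label introduced here (it is not notation from the paper) for the lower bound in Theorem 3.4.

```latex
% f_k denotes the lower bound of Theorem 3.4; the name is used only in this sketch.
\begin{align*}
k = 1:&\quad f_1(p) = p \log\tfrac{1}{p}, \qquad
        f_1'(p) = \log\tfrac{1}{p} - 1 = 0 \iff p = \tfrac{1}{e}, \qquad
        f_1\big(\tfrac{1}{e}\big) = \tfrac{1}{e}. \\
k > 1:&\quad f_k(p) = \tfrac{k}{k-1}\,\big(p - p^{k}\big), \qquad
        f_k'(p) = \tfrac{k}{k-1}\,\big(1 - k p^{k-1}\big) = 0 \iff p^{k-1} = \tfrac{1}{k}, \\
      &\quad f_k(p_k) = \tfrac{k}{k-1}\, p_k \big(1 - \tfrac{1}{k}\big) = p_k.
\end{align*}
```

In both cases $f_k$ is concave on $(0, 1)$ (its second derivative is $-1/p$ for $k = 1$ and $-k^2 p^{k-2}$ for $k > 1$), so the stationary point is the maximum. Since $X(p_k) \sim \mathrm{Bin}(n, p_k)$, the expected proportion of elements rejected without consideration is $p_k$, which equals the bound $f_k(p_k)$.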

We now wish to prove the following theorem, which, with the right choice of $p$, will give us a
lower bound of $1/e$ for all posets, as in Theorem 1.2.

Theorem 3.6. Let $(P, \prec)$ be a poset with $n$ elements. Suppose that $(P, \prec)$ has $k$ maximal
elements. Then

\[
\mathbb{P}[\pi(\tau_k(p)) \in \max(P)] \geq k p^k \log \frac{1}{p}.
\]

The proof will use two simple lemmas. The first states that the linear order is the hardest of
all posets with a unique maximal element.

Lemma 3.7. Let $(P, \prec)$ be a poset with $n$ elements. Suppose that $(P, \prec)$ has exactly one maximal
element. Then the probability that $\tau_1(p)$ is successful on $(P, \prec)$ is at least the probability that it
is successful on a linear ordering of $P$, and hence

\[
\mathbb{P}[\pi(\tau_1(p)) \in \max(P)] \geq p \log \frac{1}{p}.
\]

Proof. We begin by taking an arbitrary linear extension of $\prec$, that is, a partial order $\prec'$ such
that any two elements are comparable and such that $x \prec y \Rightarrow x \prec' y$. (It is clear that such a
partial order exists.) We denote the unique element in $\max_\prec(P) = \max_{\prec'}(P)$ by $x_{\max}$.

Given this new poset, $(P, \prec')$, we define random variables $\pi'$, $X'(p)$, $S'(p)$ and $\tau'_1(p)$ in the
same way as $\pi$, $X(p)$, $S(p)$ and $\tau_1(p)$ were defined given $(P, \prec)$. We couple the random variables
$(\pi, X(p), S(p), \tau_1(p))$ and $(\pi', X'(p), S'(p), \tau'_1(p))$ in the obvious way; we set $\pi' = \pi$ and $X'(p) =
X(p)$, and hence $S'(p) = S(p)$. This means that the elements appear in the same order in both
instances, and the same set $S(p)$ is rejected in both cases. The induced posets observed in the
process on $(P, \prec')$ are linear extensions of those observed in the process on $(P, \prec)$. We show
that if $\pi(\tau_1(p)) \neq x_{\max}$ then $\pi'(\tau'_1(p)) \neq x_{\max}$, that is, if $\tau_1(p)$ fails in the process on $(P, \prec)$ then
$\tau'_1(p)$ fails on $(P, \prec')$. From this, the result follows, since the probability of success is therefore at
least as large on $(P, \prec)$ as on $(P, \prec')$, and Theorem 3.3 applied to $(P, \prec')$ gives the lower bound.
If we reach $x_{\max}$ then it will be accepted, since it must be the unique maximal element in the
poset induced by the elements observed so far. Thus $\pi(\tau_1(p)) \neq x_{\max}$ if either

(i) $x_{\max} \in S(p)$ or
(ii) after rejecting $S(p)$, we accept an element that appears earlier than $x_{\max}$.

In case (i), $\tau'_1(p)$ must fail on $(P, \prec')$ for the same reason, since $S'(p) = S(p)$. In case (ii), such
an element must be the unique maximal element of the poset induced by what we have seen so
far, and this is still the case in any linear extension. Therefore, with $\tau'_1(p)$, if this element is
observed then it must be accepted, and so we still accept an element that appears earlier than
$x_{\max}$. It follows that in either case $\pi'(\tau'_1(p)) \neq x_{\max}$. □

The next lemma gives a lower bound for the probability of success restricting our attention
to the case when all but one of the maximal elements of our poset are in $S(p)$. This turns out
to be enough to prove Theorem 3.6.

Lemma 3.8. Let $(P, \prec)$ be a poset with $n$ elements. Suppose that $(P, \prec)$ has $k$ maximal elements.
Then

\[
\mathbb{P}\big[\pi(\tau_k(p)) \in \max(P) \,\big|\, |\max(P) \cap S(p)| = k - 1\big] \geq \frac{p}{1-p} \log \frac{1}{p}.
\]

Proof. We first observe that we may assume that $k = 1$, for the following reason. The condition
that $|\max(P) \cap S(p)| = k - 1$ means that the $k - 1$ maximal elements in $\max(P) \cap S(p)$ will
be maximal for the remainder of the process, so when using $\tau_k(p)$ we may ignore these and
all elements dominated by at least one of these, and wait for a unique maximal element from
the remaining elements. Those elements form a poset $(P', \prec')$ with a unique maximal element
$x_{\max}$, which is not in $S(p)$. Since all elements are in $S(p)$ with probability $p$ independently of
the others, the situation is the same as if we were working with $(P', \prec')$ and conditioning on
$x_{\max} \notin S(p)$. Looking for one of at most $k$ maximal elements in $P$ using $\tau_k(p)$ is the same as
looking for a unique maximal element in $P'$ using $\tau_1(p)$.

We assume from now on that $k = 1$; we shall use Lemma 3.7 to prove the result in this case.
Lemma 3.7 used the bound from Theorem 3.3, and we recall from the proof of that theorem that
the lower bound for $\mathbb{P}[\pi(\tau_1(p)) = x_{\max}]$ is in fact a lower bound for
$\mathbb{P}[(S(p) \neq P) \wedge (\pi(\tau_1(p)) = x_{\max})]$.

Let us write $M$ for the event that $x_{\max} \in S(p)$ and $W$ for the event $(S(p) \neq P) \wedge (\pi(\tau_1(p)) =
x_{\max})$. We note that if $S(p) \neq P$ and $x_{\max} \in S(p)$ then $\pi(\tau_1(p)) \neq x_{\max}$ and hence $W = W \wedge M^c$.
Therefore,

\[
\mathbb{P}(W) = \mathbb{P}(W \wedge M^c) = \mathbb{P}(M^c)\, \mathbb{P}(W \mid M^c) = (1 - p)\, \mathbb{P}(W \mid M^c).
\]

Since, by the bound from Lemma 3.7,

\[
\mathbb{P}(W) \geq p \log \frac{1}{p},
\]

and the quantity that we are interested in is at least $\mathbb{P}(W \mid M^c)$, the result follows. □

We are now in a position to prove our theorem.

Proof of Theorem 3.6. We have

\[
\mathbb{P}\big[|\max(P) \cap S(p)| = k - 1\big] = k p^{k-1} (1 - p).
\]

Thus, for general $k$,

\begin{align*}
\mathbb{P}[\pi(\tau_k(p)) \in \max(P)] &\geq \mathbb{P}\big[\pi(\tau_k(p)) \in \max(P) \,\big|\, |\max(P) \cap S(p)| = k - 1\big] \times \mathbb{P}\big[|\max(P) \cap S(p)| = k - 1\big] \\
&\geq \frac{p}{1-p} \log \frac{1}{p} \cdot k p^{k-1} (1 - p) = k p^k \log \frac{1}{p},
\end{align*}

as required. □

This gives us the following corollary. The probability $e^{-1/k}$ is chosen to maximize the function
in Theorem 3.6 and gives the lower bound in Theorem 1.2. As mentioned earlier, it is well known
that $1/e$ is the best possible lower bound for the probability of success in the classical secretary
problem, and so this completes Theorem 1.2.

Corollary 3.9. Let $(P, \prec)$ be a poset with $n$ elements of which $k$ are maximal. Then

\[
\mathbb{P}\big[\pi(\tau_k(e^{-1/k})) \in \max(P)\big] \geq \frac{1}{e}.
\]

4. Upper bound

In this section, we show that the bound in Corollary 3.5 is best possible. The proof of
Theorem 3.3 shows that the probability of success of the stopping time $\tau_k(p_k)$ on $k$ disjoint
chains decreases towards the given lower bound as the chains increase in length. This might
suggest that the probability of success of an optimal strategy is reduced as the chains increase
in length and thus, to prove that the bounds are best possible, we would consider chains with
length tending to infinity. The main theorem in this section, Theorem 4.1, does just that; for
sufficiently long chains, the probability of success of an optimal stopping time can be made
arbitrarily close to that in Corollary 3.5, and so $\tau_k(p_k)$ is asymptotically optimal. Since the
poset consisting of $k$ disjoint chains satisfies the conditions of Corollary 3.5, the bounds given
are the best possible such bounds.

We define $D_k(x)$ to be the poset consisting of $k$ disjoint chains, each of size $x$. It might be
useful at this point to recall some definitions from Section 2. The probability space associated
with the poset $D_k(x)$ is denoted by $(\Omega_{D_k(x)}, \mathcal{F}_{D_k(x)}, \mathbb{P}_{D_k(x)})$, but we suppress the subscripts when
they are clear from the context. The poset induced by the first $t$ elements that we observe is
isomorphic to the random variable $P_t$, which is a poset on vertex set $[t]$, and $\mathcal{F}_t$ is the $\sigma$-algebra
generated by $P_1, \ldots, P_t$, which represents what we know at time $t$. A stopping time is a random
variable $\tau$ taking values in $[n]$ and satisfying the property

\[
\{\tau = t\} \in \mathcal{F}_t.
\]

We shall use the notation $C(\mathcal{F}_t)$ to denote the class of all such stopping times, and extend this
notation to any sequence of $\sigma$-algebras in the analogous way.

We are trying to find an upper bound for $\mathbb{E}(Z_\tau)$ that holds for all stopping times $\tau$, where

\[
Z_t = \mathbb{P}[\pi(t) \in \max(D_k(x)) \mid \mathcal{F}_t].
\]

Theorem 4.1 states that, as $x \to \infty$, the limit of the probability of success of the optimal
stopping time on $D_k(x)$ is no greater than $p_k$. Since Corollary 3.5 showed the existence of a
stopping time with probability of success at least $p_k$, Theorem 4.1 shows that this is the best
possible such bound and so gives Theorem 1.1.

Theorem 4.1. Let $p_k$ be as defined in (1.1). Then

\[
\limsup_{x \to \infty}\ \sup_{\tau \in C(\mathcal{F}_t)} \mathbb{E}_{D_k(x)}(Z_\tau) \leq p_k.
\]

Note that the supremum is over stopping times in $C(\mathcal{F}_t)$, which means that we are allowed
to use the extra information from the structure of the posets, not just the pay-offs that we are
offered.

The following observation is important, so we record it as a lemma. 

Lemma 4.2. When $(P, \prec) = D_k(x)$, we have

\[
Z_t = \begin{cases} \dfrac{y}{x} & \text{if } \pi(t) \in \max(P_t),\ \pi(t) \in C \text{ and } |C \cap \{\pi(1), \ldots, \pi(t)\}| = y, \\[2mm] 0 & \text{if } \pi(t) \notin \max(P_t), \end{cases}
\]

where $C$ is one of the $k$ chains in $D_k(x)$.

Proof. The maximal element of a chain $C$ is equally likely to be at any position in the order in
which its $x$ elements are observed. Therefore, when $y$ elements have been observed from this
chain, the probability that one of them is the maximal element is $\frac{y}{x}$, independent of the most
recently observed element being maximal. □
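
The single-chain statement underlying this lemma can be verified exactly by enumeration for small chains. The sketch below considers one chain only (the other chains play no role in the conditional probability), labels its elements $0, \ldots, x-1$ by height, and uses the hypothetical name `conditional_top_probability`.

```python
from fractions import Fraction
from itertools import permutations

def conditional_top_probability(x, y):
    """P[the y-th observed element is the top of the chain, given that it is
    the highest of the first y observed], over all orderings of x elements."""
    records = tops = 0
    for perm in permutations(range(x)):
        if perm[y - 1] == max(perm[:y]):      # y-th element is maximal so far
            records += 1
            tops += perm[y - 1] == x - 1      # ... and is the overall top
    return Fraction(tops, records)
```

For a chain of five elements the function returns $y/5$ for each $y \in \{1, \ldots, 5\}$, in agreement with the lemma.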

The proof of Theorem 4.1 will proceed roughly as follows. At time $t$ we expect to have seen
approximately $\frac{t}{k}$ elements from each chain. Therefore, since all orders are equally likely, the $t$-th
element that we observe is maximal in what we have seen so far with probability approximately
$\frac{1}{t/k} = \frac{k}{t}$. By Lemma 4.2, if this happens then it is a maximal element of $D_k(x)$ with probability
approximately $\frac{t}{kx}$. We conclude that $Z_t$ is approximately distributed as

\[
\begin{cases} \dfrac{t}{kx} & \text{with probability } \dfrac{k}{t}, \\[2mm] 0 & \text{with probability } 1 - \dfrac{k}{t}. \end{cases} \tag{4.1}
\]

If $Z_t$ were distributed exactly like this with the $Z_t$ all independent of each other, then the
proof would not be difficult to complete. Since the potential non-zero value of $Z_t$ increases
with $t$, it is straightforward to show (as in the classical secretary problem; more details will
be given later in this section) that the optimal strategy is to ignore the first $I$ elements and
accept the next non-zero $Z_t$. Let us denote the associated stopping time by $\tau_I$ and make some
rough calculations. This is only an outline of the more precise argument that will be given
later; in particular, $\sim$ is only intended to have an intuitive meaning and does not stand for
any well-defined relation. We find that

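\[ \begin{aligned}
\mathbb{E}(Z_{\tau_I})
  &= \sum_{t=I+1}^{n} \mathbb{P}(\tau_I = t)\, \mathbb{E}(Z_t \mid \tau_I = t) \\
  &\sim \sum_{t=I+1}^{n} \mathbb{P}\big( (Z_i = 0 \ \forall i \in \{I+1, \dots, t-1\}) \wedge (Z_t > 0) \big)\, \frac{t}{kx} \\
  &\sim \sum_{t=I+1}^{n} \Bigg( \prod_{i=I+1}^{t-1} \Big( 1 - \frac{k}{i} \Big) \Bigg) \frac{k}{t} \cdot \frac{t}{kx}.
\end{aligned} \]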
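We shall apply this formula in the case where $k$ is much smaller than $I$, so we can approximate
$1 - \frac{k}{i}$ by $e^{-k/i}$, and twice approximate sums by integrals to obtain

\[ \mathbb{E}(Z_{\tau_I})
  \sim \frac{1}{x} \sum_{t=I+1}^{n} \exp\Bigg( -k \sum_{i=I+1}^{t-1} \frac{1}{i} \Bigg)
  \sim \frac{1}{x} \sum_{t=I+1}^{n} \Big( \frac{I}{t} \Big)^{k}
  \sim \begin{cases}
    \frac{I}{n} \log\big( \frac{n}{I} \big) & \text{if } k = 1, \\
    \frac{k}{k-1} \Big( \frac{I}{n} - \big( \frac{I}{n} \big)^{k} \Big) & \text{if } k > 1.
  \end{cases} \]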

This is the formula in Theorem 3.4 with $p = I/n$ and is thus maximized, as in Corollary 3.5,
when $I/n = p_k$, in which case $\mathbb{E}(Z_{\tau_I}) \sim p_k$. Therefore the bounds in
Corollary 3.5 are best possible, and if these calculations had been exact then the proof of
Theorem 1.1 would be complete.
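The rough calculation above is easy to reproduce numerically. The sketch below is illustrative
rather than part of the argument: the function names are hypothetical, `cutoff` plays the role of
$I$, the closed form is the one obtained from the integral approximation with $p = I/n$, and the
expression used for $p_k$ (namely $1/e$ for $k = 1$ and $k^{-1/(k-1)}$ for $k > 1$) is an
assumption, taken to be the maximizer of that closed form.

```python
import math


def heuristic_payoff(k, x, cutoff):
    """Sum over t > cutoff of prod_{i=cutoff+1}^{t-1} (1 - k/i) * (k/t) * (t/(k*x)), with n = k*x."""
    n = k * x
    total = 0.0
    survive = 1.0  # empty product for t = cutoff + 1
    for t in range(cutoff + 1, n + 1):
        total += survive * (k / t) * (t / (k * x))
        survive *= 1.0 - k / t  # extend the product up to i = t
    return total


def closed_form(k, p):
    """Integral approximation of the sum above, written in terms of p = cutoff / n."""
    if k == 1:
        return p * math.log(1.0 / p)
    return k / (k - 1) * (p - p ** k)


def p_k(k):
    """Assumed maximizer of closed_form(k, .); the maximum value equals the maximizer."""
    return 1.0 / math.e if k == 1 else k ** (-1.0 / (k - 1))


if __name__ == "__main__":
    for k in (1, 2, 3):
        x = 6000 // k
        cutoff = round(p_k(k) * k * x)
        print(k, heuristic_payoff(k, x, cutoff), closed_form(k, p_k(k)), p_k(k))
```

The product is accumulated incrementally, so the whole sum costs $O(n)$ operations.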

Unfortunately, $Z_t$ is not distributed exactly as in (4.1). In order to conclude that the optimal
stopping time is of the simple form above, we should like to use the principle of backward
induction, described later (see also [2, Theorem 3.2]). This formalizes the intuitive principle
that, in a finite game, the optimal strategy is simply to analyse at each step whether or not we
expect our situation to improve by continuing, and to do so if and only if this is the case.

The reason why the sequence of random variables $(Z_t)_{t \in [n]}$ is difficult to analyse is that
the values they can take vary depending on how the process unfolds. However, it is very likely that
at any time we shall have seen approximately the same number of elements from each chain.
The proof will therefore proceed by defining a sequence of random variables $(Y_t)_{t \in [n]}$,
which act as asymptotically almost sure upper bounds for $Z_t$ and are easier to analyse. To obtain
these bounds, we shall split each chain into $m$ segments, each of length $\ell$, and split the
process into $m$ sets of $k\ell$ observations. These lengths $\ell$ are margins of error beyond
which we do not expect the number of elements observed from a chain to deviate. Initially, we shall
fix $m$ and let $\ell \to \infty$ to find an upper bound for $\mathbb{E}(Y_\tau)$ and hence
$\mathbb{E}(Z_\tau)$ in terms of $m$. Letting $m \to \infty$ will then give us a best possible
result.

This means that the precise statement we shall prove for Theorem 4.1 is in fact

\[ \lim_{m \to \infty} \lim_{\ell \to \infty} \sup_{\tau \in \mathcal{C}(\mathcal{F}_t)} \mathbb{E}_{D_k(\ell m)}(Z_\tau) \le p_k. \]

However, this is purely a matter of convenience; it is clear that the proof can be extended to
posets $D_k(x)$ where $x$ is not a multiple of $m$ by dividing each chain into $m$ almost equal
rather than exactly equal segments.

We need to show that the process behaves in this approximately uniform manner with high
probability as $\ell \to \infty$. We shall first define what it means to be approximately uniform
in one particular chain $C$ at time $k\ell s$, an event we call $U_{C,s}$ (see Figure 3), and then
what it means to be approximately uniform everywhere at all times, an event we call $U$.

Given one of the chains, $C$, and for all $s \in \{0, \dots, m\}$, let $U_{C,s}$ be the event that
when we have observed $k\ell s$ elements in total we have observed between $\ell(s-1)$ and
$\ell(s+1)$ elements from $C$, that is,

\[ U_{C,s} = \{ \ell(s-1) \le |C \cap \{\pi(1), \dots, \pi(k\ell s)\}| \le \ell(s+1) \}. \]

For all $t \in \{0, \dots, k\ell m\}$, let $s(t)$ be the unique integer $s$ such that
$k\ell(s-1) < t \le k\ell s$, that is,

\[ s(t) = \left\lceil \frac{t}{k\ell} \right\rceil. \tag{4.2} \]
[Figure 3: diagram of the chains $C_1, C_2, \dots, C_k$, with the levels $\ell(s-1)$, $\ell s$,
$\ell(s+1)$ and $\ell m$ marked on the vertical axis.]

Figure 3. This figure shows the number of elements observed from each chain.
In this example, after a total of $k\ell s$ elements have been observed we see that
$U_{C_1,s}$ and $U_{C_k,s}$ hold but $U_{C_2,s}$ does not.

Let $U$ be the event that for all $t$ when we have observed $t$ elements in total we have observed
between $\ell(s(t)-2)$ and $\ell(s(t)+1)$ elements from each chain, that is,

\[ U = \bigcap_{i,t} \{ \ell(s(t)-2) \le |C_i \cap \{\pi(1), \dots, \pi(t)\}| \le \ell(s(t)+1) \}. \]

We shall use Lemma 4.4, which follows easily from Lemma 4.3. It states that the process is
approximately uniform with high probability.

Lemma 4.3. Let $m \ge 1$ be an integer, let $C$ be one of the chains in $D_k(\ell m)$ and let
$s \in \{0, \dots, m\}$. Then

\[ \lim_{\ell \to \infty} \mathbb{P}_{D_k(\ell m)}(U_{C,s}) = 1. \]

Proof. We show that the probability that we have observed more than $\ell(s+1)$ or fewer than
$\ell(s-1)$ elements tends to zero as $\ell \to \infty$. (If $s = 0$ or $s = m$ then we need
consider only one of these tails.)

Assume $C$ and $s$ are given. Let $N$ be the number of elements we have observed from chain $C$
when we have observed $k\ell s$ elements in total. It is straightforward to check that

\[ \mathbb{P}(N = x) = \frac{\binom{\ell m}{x} \binom{(k-1)\ell m}{k\ell s - x}}{\binom{k\ell m}{k\ell s}} \]

is increasing for $x < \ell s$ and decreasing for $x > \ell s$, and that

\[ \left| \frac{\mathbb{P}(N = x+1)}{\mathbb{P}(N = x)} - 1 \right| > c > 0 \]

if $x \notin [\ell(s-1), \ell(s+1)]$, for some absolute $c > 0$. It follows that
$\mathbb{P}(N \notin [\ell(s-1), \ell(s+1)]) = O(1/\ell)$ as $\ell \to \infty$. □
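The tail probability in this proof can be evaluated exactly for small parameters. The sketch below
assumes that $N$ follows the hypergeometric distribution described in the proof ($k\ell s$ draws
without replacement from $k\ell m$ elements, of which $\ell m$ lie in $C$); the function names
`pmf` and `tail` are hypothetical.

```python
from fractions import Fraction
from math import comb


def pmf(k, ell, m, s, x):
    """P(N = x): x of the first k*ell*s observations come from one fixed chain."""
    chain = ell * m                # elements in the chain C
    others = (k - 1) * ell * m     # elements in the remaining k-1 chains
    draws = k * ell * s            # observations made so far
    if x < 0 or x > chain or draws - x < 0 or draws - x > others:
        return Fraction(0)
    return Fraction(comb(chain, x) * comb(others, draws - x), comb(chain + others, draws))


def tail(k, ell, m, s):
    """P(N lies outside [ell*(s-1), ell*(s+1)])."""
    lo, hi = ell * (s - 1), ell * (s + 1)
    inside = sum(pmf(k, ell, m, s, x) for x in range(max(lo, 0), hi + 1))
    return 1 - inside


if __name__ == "__main__":
    for ell in (2, 5, 10, 20):
        print(ell, float(tail(2, ell, 4, 2)))  # should shrink as ell grows
```

Here the interval is taken to be closed at both ends, which is an assumption about the intended
definition of $U_{C,s}$.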

Lemma 4.4. Let $m \ge 1$ be an integer. Then

\[ \lim_{\ell \to \infty} \mathbb{P}_{D_k(\ell m)}(U) = 1. \]




Proof. This lemma follows simply from the previous lemma: choose $\ell$ sufficiently large that
each of the $k(m+1)$ events $U_{C_i,s}$ occurs with probability at least $1 - \delta$. Then,
trivially, all $k(m+1)$ events hold with probability at least $1 - k(m+1)\delta$. It is easy to see
that

\[ \bigcap_{C_i, s} U_{C_i,s} \subseteq U, \]

since the events $U_{C_i,s(t)-1}$ and $U_{C_i,s(t)}$ imply that

\[ \ell(s(t)-2) \le |C_i \cap \{\pi(1), \dots, \pi(t)\}| \le \ell(s(t)+1), \]

and so this holds for every $t$ and $i$, as required. □
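A quick simulation illustrates how the event $U$ becomes typical as $\ell$ grows. This is only a
Monte Carlo sketch under stated assumptions: elements are represented by their chain labels alone,
the bounds in $U$ are treated as inclusive, and the function names, trial counts and seed are
arbitrary choices made for illustration.

```python
import random


def uniform_event_holds(k, ell, m, rng):
    """Draw one uniformly random observation order of D_k(ell*m) and report whether U holds."""
    labels = [chain for chain in range(k) for _ in range(ell * m)]
    rng.shuffle(labels)
    counts = [0] * k
    for t, chain in enumerate(labels, start=1):
        counts[chain] += 1
        s = -(-t // (k * ell))                 # s(t) = ceil(t / (k*ell)), as in (4.2)
        lo, hi = ell * (s - 2), ell * (s + 1)
        if any(not (lo <= c <= hi) for c in counts):
            return False
    return True


def estimate_prob_U(k, ell, m, trials=200, seed=0):
    """Monte Carlo estimate of P(U) from independent random orders."""
    rng = random.Random(seed)
    return sum(uniform_event_holds(k, ell, m, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    for ell in (1, 2, 5, 20, 100):
        print(ell, estimate_prob_U(2, ell, 6))
```

For fixed $k$ and $m$ the estimates should approach 1 as $\ell$ increases, in line with Lemma 4.4.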

The next lemma states that in order to prove Theorem 4.1 it suffices to show that its statement
is true if we condition on $U$ holding. This formalizes the intuition that, since the process is
asymptotically almost surely uniform (that is, since
$\lim_{\ell \to \infty} \mathbb{P}_{D_k(\ell m)}(U) = 1$), we may assume that it is uniform.

Recall that $\mathcal{C}(\mathcal{F}_t)$ is the class of all stopping times relative to the
$\sigma$-algebras $\mathcal{F}_t = \sigma(P_1, \dots, P_t)$, that is, the decision to stop at time
$t$ depends only on $P_1, \dots, P_t$.

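Lemma 4.5. For all $m$,

\[ \lim_{\ell \to \infty} \sup_{\tau \in \mathcal{C}(\mathcal{F}_t)} \mathbb{E}_{D_k(\ell m)}(Z_\tau)
   \le \lim_{\ell \to \infty} \sup_{\tau \in \mathcal{C}(\mathcal{F}_t)} \mathbb{E}_{D_k(\ell m)}(Z_\tau \mid U). \]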

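Proof. By Lemma 4.4, for all $\varepsilon > 0$ we may choose $\ell$ sufficiently large that
$\mathbb{P}_{D_k(\ell m)}(U^c) < \varepsilon$. We also know that $Z_t \le 1$ for all $t$.
Therefore, for all $\tau$,

\[ \mathbb{E}_{D_k(\ell m)}(Z_\tau)
   = \mathbb{E}_{D_k(\ell m)}(Z_\tau \mid U)\, \mathbb{P}_{D_k(\ell m)}(U)
   + \mathbb{E}_{D_k(\ell m)}(Z_\tau \mid U^c)\, \mathbb{P}_{D_k(\ell m)}(U^c)
   \le \mathbb{E}_{D_k(\ell m)}(Z_\tau \mid U) + \varepsilon. \]

We now take suprema to obtain

\[ \sup_{\tau \in \mathcal{C}(\mathcal{F}_t)} \mathbb{E}_{D_k(\ell m)}(Z_\tau)
   \le \sup_{\tau \in \mathcal{C}(\mathcal{F}_t)} \mathbb{E}_{D_k(\ell m)}(Z_\tau \mid U) + \varepsilon. \]

Since $\varepsilon$ is arbitrary, the result follows. □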

Next, we define the random variables $Y_t$ that act as upper bounds for the $Z_t$ and are easier
to analyse. These random variables are not strict upper bounds, in the sense that the random
variables are not coupled in any way. However, conditioned on $U$ occurring, $Z_t$ is less than the
potential non-zero value of $Y_t$ and the probability that $Z_t$ is non-zero is less than the
probability that $Y_t$ is non-zero, and we shall be able to show that the optimal strategy for the
game on $Z_t$ has a lower expected pay-off than the optimal strategy for the game on $Y_t$.

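We shall often need to refer to the potential non-zero value of $Y_t$ and the probability that it
takes this value, so let

\[ y_t = \frac{s(t)+1}{m} \quad \text{and} \quad
   p_t = \begin{cases}
     \frac{1}{\ell(s(t)-2)} & \text{if } s(t) > 2, \\
     1 & \text{if } s(t) \le 2,
   \end{cases} \tag{4.3} \]

where $s(t)$ is as defined in (4.2). We now define a sequence of independent random variables
$(Y_t)_{t \in [n]}$ by

\[ Y_t = \begin{cases}
     y_t & \text{with probability } p_t, \\
     0 & \text{with probability } 1 - p_t.
   \end{cases} \]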
We shall not define these explicitly on any probability space as there is no need to do so, although
it is of course straightforward to do it on ri. 

The next lemma states the intuitive principle that we expect to do at least as well in the
game with the random variables $Y_t$ as in the game with the random variables $Z_t$ conditioned on
$U$ occurring. Analogously to $\mathcal{F}_t$ for $Z_t$, let $(\mathcal{G}_t)_{t \in [n]}$ be
defined by $\mathcal{G}_t = \sigma(Y_1, \dots, Y_t)$.

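Lemma 4.6. For all $\ell, m \in \mathbb{N}$ with $m \ge 3$,

\[ \sup_{\tau \in \mathcal{C}(\mathcal{F}_t)} \mathbb{E}_{D_k(\ell m)}(Z_\tau \mid U)
   \le \sup_{\tau \in \mathcal{C}(\mathcal{G}_t)} \mathbb{E}_{D_k(\ell m)}(Y_\tau). \]

Putting Lemmas 4.5 and 4.6 together tells us that

\[ \lim_{\ell \to \infty} \sup_{\tau \in \mathcal{C}(\mathcal{F}_t)} \mathbb{E}_{D_k(\ell m)}(Z_\tau)
   \le \lim_{\ell \to \infty} \sup_{\tau \in \mathcal{C}(\mathcal{G}_t)} \mathbb{E}_{D_k(\ell m)}(Y_\tau). \]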

In order to prove this lemma, we need a precise statement of what backward induction tells 
us is the optimal stopping time in a finite process. We define a new random variable for each t, 
the value of the game at time t. This is the expected pay-off ultimately accepted given what has 
happened so far. We calculate these values inductively, starting at the end. The value of the 
game at the final step is just the final pay-off offered. The value of the game at each earlier step 
is the maximum of the currently offered pay-off and the expected value of the game at the next 
step. The optimal strategy is to stop when the currently offered pay-off is at least the expected 
value at the next step. 

In the backward induction theorem below, the pay-offs offered are the $W_t$ and the values at
each step are the $\gamma_t$. The $\sigma$-algebras $\mathcal{A}_t$ represent what we know at time
$t$. We remind the reader that being $\mathcal{A}_t$-measurable means that
$\sigma(W_t) \subseteq \mathcal{A}_t$, that is, the value of $W_t$ is determined by what we know at
time $t$ or, in the finite world, $W_t$ is constant on each atom of $\mathcal{A}_t$. In fact,
the nested condition means that $\mathcal{A}_t \supseteq \sigma(W_1, \dots, W_t)$. The statement of
the theorem is that the strategy that stops at the first $t$ when $W_t = \gamma_t$ (or,
equivalently, when $W_t$ is at least as large as the expected value of $\gamma_{t+1}$ given
$\mathcal{A}_t$) is indeed a stopping time and achieves the optimal value.
For more details, see [2, Theorem 3.2].

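Backward induction. Let $\mathcal{A}_1 \subseteq \dots \subseteq \mathcal{A}_n$ be a nested sequence
of $\sigma$-algebras and let $W_1, \dots, W_n$ be a sequence of random variables with each $W_t$
being $\mathcal{A}_t$-measurable. Let $\mathcal{C}(\mathcal{A}_t)$ be the class of stopping times
relative to $(\mathcal{A}_t)_{t \in [n]}$ and let $v^*$ be given by

\[ v^* = \sup_{\tau \in \mathcal{C}(\mathcal{A}_t)} \mathbb{E}(W_\tau). \]

Define successively $\gamma_n, \gamma_{n-1}, \dots, \gamma_1$ by setting

\[ \gamma_n = W_n, \]

\[ \gamma_t = \max\{W_t, \mathbb{E}(\gamma_{t+1} \mid \mathcal{A}_t)\}, \quad t = n-1, \dots, 1. \]

Let

\[ \tau^* = \min\{t : W_t = \gamma_t\}. \]

Then $\tau^* \in \mathcal{C}(\mathcal{A}_t)$ and

\[ \mathbb{E}(W_{\tau^*}) = \mathbb{E}(\gamma_1) = v^* \ge \mathbb{E}(W_\tau) \quad \text{for all } \tau \in \mathcal{C}(\mathcal{A}_t). \]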

This theorem provides the machinery we need to prove our lemma. 
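To make the recursion concrete, the following sketch carries out backward induction in the
simplest setting, where the pay-offs $W_1, \dots, W_n$ are independent with known finite
distributions, so that $\mathbb{E}(\gamma_{t+1} \mid \mathcal{A}_t)$ is a constant. The function
names and the die-rolling example are our own illustrative assumptions and do not appear in the
theorem.

```python
from fractions import Fraction


def backward_induction(payoffs):
    """Return v with v[t] = E(gamma_t) for independent pay-offs W_1, ..., W_n.

    payoffs[t-1] is a list of (value, probability) pairs describing W_t.
    Index 0 of the returned list is unused.
    """
    n = len(payoffs)
    v = [None] * (n + 1)
    v[n] = sum(Fraction(w) * Fraction(p) for w, p in payoffs[-1])  # gamma_n = W_n
    for t in range(n - 1, 0, -1):
        # gamma_t = max{W_t, E(gamma_{t+1})}; by independence the continuation is constant.
        v[t] = sum(max(Fraction(w), v[t + 1]) * Fraction(p) for w, p in payoffs[t - 1])
    return v


def should_stop(t, w, v):
    """Rule tau*: stop at the first t with W_t >= E(gamma_{t+1}); always stop at t = n."""
    return t == len(v) - 1 or w >= v[t + 1]


if __name__ == "__main__":
    die = [(face, Fraction(1, 6)) for face in range(1, 7)]
    v = backward_induction([die, die, die])   # up to three rolls, keep the roll we stop on
    print(v[1], v[2], v[3])
```

With up to three rolls of a fair die the values are $v(3) = 7/2$, $v(2) = 17/4$ and
$v(1) = 14/3$, so on the first roll one should stop only on a 5 or a 6.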
Proof of Lemma 4.6. We shall apply backward induction to the sequences $Y_t$ and $Z_t$, conditioned
on $U$ occurring, to show that the optimal expected pay-off for the $Y_t$ is at least as large as
for the $Z_t$. For convenience, we shall continue to use $n$ to denote the number of elements in
$D_k(\ell m)$, that is, $n = k\ell m$.

Let us first consider what happens with the sequence $Y_t$. Recall that
$(\mathcal{G}_t)_{t \in [n]}$ are defined by $\mathcal{G}_t = \sigma(Y_1, \dots, Y_t)$ and let
$(\alpha_t)_{t \in [n]}$ be defined for $(Y_t)_{t \in [n]}$ as $(\gamma_t)_{t \in [n]}$ were for
$(W_t)_{t \in [n]}$ in the backward induction theorem, that is, $\alpha_n = Y_n$ and

\[ \alpha_t = \max\{Y_t, \mathbb{E}(\alpha_{t+1} \mid \mathcal{G}_t)\}, \quad t = n-1, \dots, 1. \]

Since the random variables $Y_t$ are independent, the values of $Y_1, \dots, Y_t$ give no
information about the values of $Y_{t+1}, \dots, Y_n$, and therefore
$\mathbb{E}(\alpha_{t+1} \mid \mathcal{G}_t)$ is constant on $\mathcal{G}_t$ and equal to
$\mathbb{E}(\alpha_{t+1})$. Therefore we may define the function $v : [n] \to \mathbb{R}$ by

\[ v(t) = \mathbb{E}(\alpha_t) \]

and note that backward induction tells us that the stopping time that stops at the first $t$ such
that $Y_t \ge \mathbb{E}(\alpha_{t+1} \mid \mathcal{G}_t) = v(t+1)$ is optimal. By definition,

\[ v(t) = \mathbb{E}(\max\{Y_t, v(t+1)\}) \ge v(t+1), \]

whereas $y_t$, the potential non-zero value of $Y_t$, is a non-decreasing function of $t$. We
conclude that there exists $I$ such that

\[ \begin{aligned}
  y_t &< v(t+1) && \text{if } t \le I, \\
  y_t &\ge v(t+1) && \text{if } t > I,
\end{aligned} \tag{4.4} \]

and therefore an optimal strategy for the game on $Y_t$ is 'reject the first $I$ elements, and
accept the next with a non-zero pay-off.'

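Recall that the distribution of $Y_t$ is given by

\[ Y_t = \begin{cases}
     y_t & \text{with probability } p_t, \\
     0 & \text{with probability } 1 - p_t.
   \end{cases} \]

We deduce that, since $m \ge 3$ and $s(n) = m$,

\[ v(n) = \mathbb{E}(\alpha_n) = \mathbb{E}(Y_n) = p_n y_n
   = \frac{1}{\ell(m-2)} \cdot \frac{m+1}{m} > \frac{1}{\ell m} = \frac{k}{n} \tag{4.5} \]

and that, for $1 \le t \le n-1$,

\[ v(t) = \mathbb{E}(\alpha_t) = \mathbb{E}(\max\{Y_t, \mathbb{E}(\alpha_{t+1})\})
   = \begin{cases}
       v(t+1) & \text{if } t \le I, \\
       p_t y_t + (1 - p_t)\, v(t+1) & \text{if } t > I.
     \end{cases} \tag{4.6} \]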

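We now turn our attention to the sequence of random variables $Z_t$. Since we are conditioning
on $U$, we introduce a new sequence of $\sigma$-algebras $\mathcal{H}_t$ defined by

\[ \mathcal{H}_t = \sigma(\mathcal{F}_t \cup \{U\}) \]

and shall consider only $\omega \in U$. Analogously to $\gamma_t$ and $\alpha_t$, let

\[ \beta_n = Z_n \]

and, for $1 \le t \le n-1$, let

\[ \beta_t = \max\{Z_t, \mathbb{E}(\beta_{t+1} \mid \mathcal{H}_t)\}. \]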
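Recalling that $\mathcal{C}(\mathcal{F}_t)$ denotes the class of stopping times relative to
$\mathcal{F}_t$, observe that we have $\mathcal{F}_t \subseteq \mathcal{H}_t$, and hence
$\mathcal{C}(\mathcal{F}_t) \subseteq \mathcal{C}(\mathcal{H}_t)$ and

\[ \sup_{\tau \in \mathcal{C}(\mathcal{F}_t)} \mathbb{E}_{D_k(\ell m)}(Z_\tau \mid U)
   \le \sup_{\tau \in \mathcal{C}(\mathcal{H}_t)} \mathbb{E}_{D_k(\ell m)}(Z_\tau \mid U). \]

Note that intuitively this is obvious: it just says that having extra information (that the event
$U$ occurs) can only help us in choosing our stopping time $\tau$.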

Recall that $\mathbb{E}(\beta_t \mid \mathcal{H}_{t-1})(\omega)$ denotes the expected value of the
game at time $t$, if we have so far seen the first $t-1$ elements of $P$ and are told whether or
not $U$ holds (that is, whether or not $\omega \in U$). The lemma follows from the following claim
by the Backward Induction Theorem.

Claim. For all $\omega \in U$ and for all $t \in [n]$,

\[ \mathbb{E}(\beta_t \mid \mathcal{H}_{t-1})(\omega) \le v(t), \]

where $\mathcal{H}_0 = \{\emptyset, U, U^c, \Omega\}$ and so
$\mathbb{E}(\beta_1 \mid \mathcal{H}_0)(\omega) = \mathbb{E}(\beta_1 \mid U)$.

Proof of claim. We shall prove the claim by induction on $n - t$, using (4.5) and (4.6). First,
recall that $\mathbb{E}(\beta_n \mid \mathcal{H}_{n-1})(\omega)$ is just the probability that the
final element of $\omega$ is maximal in $P$. Thus, by (4.5),

\[ \mathbb{E}(\beta_n \mid \mathcal{H}_{n-1}) = \frac{k}{n} < v(n), \]

which proves the case $n - t = 0$.

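So let $1 \le t \le n-1$, and assume that the result holds for $t+1$. We claim that

\[ \mathbb{E}(\beta_t \mid \mathcal{H}_{t-1})(\omega) \le
   \begin{cases}
     v(t+1) & \text{if } t \le I, \\
     p_t y_t + (1 - p_t)\, v(t+1) & \text{if } t > I,
   \end{cases} \tag{4.7} \]

and hence that $\mathbb{E}(\beta_t \mid \mathcal{H}_{t-1})(\omega) \le v(t)$, by (4.6).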

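In order to prove (4.7), let $\omega \in U$ and consider the atom $A \in \mathcal{H}_{t-1}$ which
contains $\omega$. Partitioning the space according to whether or not
$Z_t \ge \mathbb{E}(\beta_{t+1} \mid \mathcal{H}_t)$, we obtain

\[ \begin{aligned}
\mathbb{E}(\beta_t \mid \mathcal{H}_{t-1})(\omega) = \mathbb{E}(\beta_t \mid A)
  &= \mathbb{P}\big( Z_t \ge \mathbb{E}(\beta_{t+1} \mid \mathcal{H}_t) \,\big|\, A \big) \cdot
     \mathbb{E}\big( Z_t \,\big|\, (Z_t \ge \mathbb{E}(\beta_{t+1} \mid \mathcal{H}_t)) \cap A \big) \\
  &\quad + \mathbb{P}\big( Z_t < \mathbb{E}(\beta_{t+1} \mid \mathcal{H}_t) \,\big|\, A \big) \cdot
     \mathbb{E}\big( \mathbb{E}(\beta_{t+1} \mid \mathcal{H}_t) \,\big|\, (Z_t < \mathbb{E}(\beta_{t+1} \mid \mathcal{H}_t)) \cap A \big),
\end{aligned} \tag{4.8} \]

since if $Z_t \ge \mathbb{E}(\beta_{t+1} \mid \mathcal{H}_t)$ then the payoff is $Z_t$, and
otherwise the payoff is $\mathbb{E}(\beta_{t+1} \mid \mathcal{H}_t)$.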
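Now, by the induction hypothesis we have

\[ \mathbb{E}(\beta_{t+1} \mid \mathcal{H}_t)(\omega') \le v(t+1) \]

for every $\omega' \in (Z_t < \mathbb{E}(\beta_{t+1} \mid \mathcal{H}_t)) \cap A$, and therefore

\[ \mathbb{E}\big( \mathbb{E}(\beta_{t+1} \mid \mathcal{H}_t) \,\big|\, (Z_t < \mathbb{E}(\beta_{t+1} \mid \mathcal{H}_t)) \cap A \big) \le v(t+1). \tag{4.9} \]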

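Moreover, by Lemma 4.2 and (4.3), and since $\omega \in U$, we have

\[ Z_t(\omega) \le y_t \quad \text{and} \quad \mathbb{P}(Z_t > 0 \mid \mathcal{H}_{t-1})(\omega) \le p_t. \]

Hence

\[ \mathbb{E}\big( Z_t \,\big|\, (Z_t \ge \mathbb{E}(\beta_{t+1} \mid \mathcal{H}_t)) \cap A \big) \le y_t, \tag{4.10} \]

and

\[ \mathbb{P}\big( Z_t < \mathbb{E}(\beta_{t+1} \mid \mathcal{H}_t) \,\big|\, A \big) \ge 1 - p_t. \tag{4.11} \]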
Finally, recall that $I$ was chosen so that $y_t < v(t+1)$ if and only if $t \le I$. Now (4.7)
follows easily from (4.8), (4.9), (4.10) and (4.11). This completes the induction step, and hence
proves the claim. □

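The lemma follows from the claim, since, by the backward induction theorem, we have

\[ \sup_{\tau \in \mathcal{C}(\mathcal{H}_t)} \mathbb{E}_{D_k(\ell m)}(Z_\tau \mid U)
   = \mathbb{E}(\beta_1 \mid U) \le v(1) = \mathbb{E}(\alpha_1)
   = \sup_{\tau \in \mathcal{C}(\mathcal{G}_t)} \mathbb{E}_{D_k(\ell m)}(Y_\tau), \]

as required. □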

In the final lemma before the proof of Theorem 4.1 we use backward induction to show that
an optimal stopping time for the process with the $Y_t$ takes the simple form of rejecting the
first $k\ell u^*$ elements, for some integer $u^*$, and accepting the next non-zero pay-off.

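Lemma 4.7. For $u^* \in \{0, \dots, m-1\}$, let

\[ \tau_{u^*} = \begin{cases}
     \min\{t > k\ell u^* : Y_t > 0\} & \text{if this exists}, \\
     n & \text{otherwise}.
   \end{cases} \]

Then

\[ \sup_{\tau \in \mathcal{C}(\mathcal{G}_t)} \mathbb{E}_{D_k(\ell m)}(Y_\tau)
   = \sup_{u^* \in \{0, \dots, m-1\}} \mathbb{E}_{D_k(\ell m)}(Y_{\tau_{u^*}}). \]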

Proof. We have already shown in the proof of Lemma 4.6 that an optimal strategy takes the
form 'ignore the first $I$, and accept the next non-zero pay-off.' In fact, we see that $I$ must be
a multiple of $k\ell$: suppose, for contradiction, that $I = k\ell u^* + r$, where
$r \in \{1, \dots, k\ell - 1\}$. Then $s(I) = s(I+1)$ by (4.2); recall that $y_t = (s(t)+1)/m$ by
(4.3), and that $y_t < v(t+1)$ if and only if $t \le I$ by (4.4). It follows that

\[ v(I+2) \le y_{I+1} = \frac{s(I+1)+1}{m} = \frac{u^*+2}{m}, \]

since we would be willing to stop at time $I+1$. But then, by (4.6),

\[ v(I+1) \le \max\{y_{I+1}, v(I+2)\} = \frac{u^*+2}{m} = \frac{s(I)+1}{m} = y_I. \]

Thus we would in fact be willing to stop at time $I$, which is a contradiction. □
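The threshold structure in (4.4) and the conclusion of this lemma can be checked by direct
computation for small parameters. The sketch below is illustrative: it assumes the definitions
$s(t) = \lceil t/(k\ell) \rceil$, $y_t = (s(t)+1)/m$ and $p_t = 1/(\ell(s(t)-2))$ for $s(t) > 2$
(with $p_t = 1$ otherwise), uses the convention $v(n+1) = 0$ purely for convenience, and writes
`cutoff` for $I$; all function names are hypothetical.

```python
from fractions import Fraction


def s(t, k, ell):
    """s(t) = ceil(t / (k*ell))."""
    return -(-t // (k * ell))


def y(t, k, ell, m):
    """Potential non-zero value of Y_t."""
    return Fraction(s(t, k, ell) + 1, m)


def p(t, k, ell):
    """Probability that Y_t is non-zero."""
    st = s(t, k, ell)
    return Fraction(1, ell * (st - 2)) if st > 2 else Fraction(1)


def values_and_cutoff(k, ell, m):
    """Backward induction for the independent Y_t; returns (v, cutoff)."""
    n = k * ell * m
    v = [Fraction(0)] * (n + 2)        # v[n+1] = 0 is a sentinel, not part of the model
    for t in range(n, 0, -1):
        pt, yt = p(t, k, ell), y(t, k, ell, m)
        v[t] = pt * max(yt, v[t + 1]) + (1 - pt) * v[t + 1]   # E max{Y_t, v(t+1)}
    # cutoff = largest t <= n-1 with y_t < v(t+1), or 0 if there is no such t
    cutoff = max((t for t in range(1, n) if y(t, k, ell, m) < v[t + 1]), default=0)
    return v, cutoff


if __name__ == "__main__":
    k, ell, m = 2, 3, 8
    v, cutoff = values_and_cutoff(k, ell, m)
    print(cutoff, cutoff % (k * ell), float(v[1]))
```

Exact fractions are used because the argument turns on the strict inequality $y_t < v(t+1)$, which
floating-point rounding could blur.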

We are now in a position to complete the proof of the main theorem in this section. 

Proof of Theorem 4.1. All that remains is to calculate and maximize
$\mathbb{E}(Y_{\tau_{u^*}})$ over $u^*$, where $\tau_{u^*}$ is the smallest $t > k\ell u^*$ such
that $Y_t > 0$. These calculations are very similar to those on page 13 but also include error
terms which tend to zero as $\ell, m \to \infty$.

We shall assume first that $u^* \to \infty$ as $m \to \infty$, and deduce from our calculation that
this is a valid assumption. Indeed, if $t = o(n)$ then $y_t = o(1)$, whereas we shall obtain a
probability of success that is bounded away from zero. We should therefore never accept a payoff
for $t = o(n)$. In particular, for sufficiently large $m$ we have $u^* \ge 3$, and so

\[ \mathbb{E}(Y_{\tau_{u^*}}) = \sum_{u=u^*}^{m-1} \sum_{r=1}^{k\ell}
   \mathbb{P}\big( Y_{k\ell u^*+1} = 0, \dots, Y_{k\ell u+r-1} = 0,\ Y_{k\ell u+r} > 0 \big)\, y_{k\ell u+r}. \]
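Recall that the $Y_t$ are independent, and that $s(k\ell u + h) = u + 1$ when
$1 \le h \le k\ell$. Hence, using the convention that the empty product takes the value 1, we
obtain

\[ \begin{aligned}
\mathbb{E}(Y_{\tau_{u^*}})
  &= \sum_{u=u^*}^{m-1} \Bigg( \prod_{q=u^*+1}^{u} \Big( 1 - \frac{1}{\ell(q-2)} \Big)^{k\ell}
       - \prod_{q=u^*+1}^{u+1} \Big( 1 - \frac{1}{\ell(q-2)} \Big)^{k\ell} \Bigg) \frac{u+2}{m} \\
  &= \frac{u^*+2}{m}
     - \Bigg( \prod_{q=u^*+1}^{m} \Big( 1 - \frac{1}{\ell(q-2)} \Big)^{k\ell} \Bigg) \frac{m+1}{m}
     + \sum_{u=u^*+1}^{m-1} \Bigg( \prod_{q=u^*+1}^{u} \Big( 1 - \frac{1}{\ell(q-2)} \Big)^{k\ell} \Bigg) \frac{1}{m}.
\end{aligned} \]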



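Now, let $\varepsilon > 0$ be arbitrary, choose $m = m(\varepsilon)$ and
$\ell = \ell(m, \varepsilon)$ sufficiently large, and recall that therefore
$u^* = u^*(\varepsilon)$ may be chosen to be sufficiently large also. Since
$\big(1 - \frac{1}{n}\big)^{n} < e^{-1} < \big(1 - \frac{1}{n}\big)^{n-1}$ for all $n \ge 2$, we
have

\[ \prod_{q=u^*+1}^{m} \Big( 1 - \frac{1}{\ell(q-2)} \Big)^{k\ell}
   \ge \exp\Bigg( - \sum_{q=u^*+1}^{m} \frac{k\ell}{\ell(q-2) - 1} \Bigg)
   \ge \Big( \frac{u^*}{m} \Big)^{k} - \frac{\varepsilon}{2}, \]

and similarly

\[ \prod_{q=u^*+1}^{u} \Big( 1 - \frac{1}{\ell(q-2)} \Big)^{k\ell}
   \le \exp\Bigg( - \sum_{q=u^*+1}^{u} \frac{k}{q-2} \Bigg)
   \le \Big( \frac{u^*}{u} \Big)^{k}. \]

Setting $p = u^*/m$, we obtain

\[ \mathbb{E}(Y_{\tau_{u^*}}) \le
   \begin{cases}
     p - p + p \log\frac{1}{p} + \varepsilon & \text{if } k = 1, \\
     p - p^{k} + \frac{p}{k-1} \big( 1 - p^{k-1} \big) + \varepsilon & \text{if } k > 1.
   \end{cases} \]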

As before, these expressions are maximized when $p = p_k$, and thus

\[ \lim_{m \to \infty} \lim_{\ell \to \infty} \mathbb{E}(Y_{\tau_{u^*}}) \le p_k + \varepsilon. \]

Since $\varepsilon > 0$ was arbitrary, this completes the proof. □
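The quantities in this proof can also be evaluated numerically, which gives a sanity check on the
limiting value. The sketch below rests on assumptions that should be read as ours: it takes
$p_t = 1/(\ell(s(t)-2))$ and $y_t = (s(t)+1)/m$ on the block of times with $s(t) = u+1$, requires
$u^* \ge 2$ so that every factor is well defined, and compares the block-by-block sum with its
summation-by-parts form; the function names are hypothetical.

```python
def block_zero(q, k, ell):
    """Probability that all k*ell pay-offs Y_t with s(t) = q are zero (requires q >= 3)."""
    return (1.0 - 1.0 / (ell * (q - 2))) ** (k * ell)


def expected_payoff(k, ell, m, u_star):
    """E(Y_tau) for the rule 'reject the first k*ell*u_star, accept the next non-zero Y_t'."""
    total, survive = 0.0, 1.0   # survive = P(no non-zero pay-off in blocks u_star..u-1)
    for u in range(u_star, m):
        zero = block_zero(u + 1, k, ell)
        total += survive * (1.0 - zero) * (u + 2) / m
        survive *= zero
    return total


def expected_payoff_telescoped(k, ell, m, u_star):
    """Same quantity after summation by parts."""
    survive = 1.0
    total = (u_star + 2) / m
    for u in range(u_star + 1, m):
        survive *= block_zero(u, k, ell)   # product over q = u_star+1..u
        total += survive / m
    survive *= block_zero(m, k, ell)       # product over q = u_star+1..m
    return total - survive * (m + 1) / m


if __name__ == "__main__":
    k, ell, m = 2, 50, 200
    values = {u: expected_payoff(k, ell, m, u) for u in range(3, m)}
    best = max(values, key=values.get)
    print(best / m, values[best])   # for k = 2 we expect values close to 1/2
```

For $k = 2$ the limiting expression $2(p - p^2)$ is maximized at $p = 1/2$ with value $1/2$, so
for large $\ell$ and $m$ both printed numbers should be close to $0.5$.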



Putting Corollary 3.5 and Theorem 4.1 together gives Theorem 1.1.